The performance of large neural networks can be judged not only by their storage capacity but also by the time required for learning. A polynomial learning algorithm with learning time $\sim N^2$ in a network with $N$ units might be practical, whereas a learning time $\sim e^N$ would allow rather small networks only. The question of the absolute storage capacity $\alpha_c$ and the capacity for polynomial learning rules $\alpha_p$ is discussed for several feed-forward architectures: the perceptron, the binary perceptron, the committee machine, and a perceptron with fixed weights in the first layer and adaptive weights in the second layer. The analysis is based partially on dynamic mean field theory, which is valid for $N\to\infty$. Especially for the committee machine, a value of $\alpha_p$ considerably lower than the capacity predicted by replica theory or simulations is found. This discrepancy is resolved by new simulations investigating the learning-time dependence and revealing subtleties in the definition of the capacity.
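To make the distinction between $\alpha_c$ and $\alpha_p$ concrete, the following is a minimal sketch (not from the paper) of the standard experiment for the simple perceptron: store $P = \alpha N$ random $\pm 1$ patterns with the classical perceptron rule and record how many training epochs are needed. The function name `perceptron_learning_time` and the stability margin `kappa` are illustrative choices; for $\kappa = 0$ the known capacity is $\alpha_c = 2$, and the measured learning time grows rapidly as $\alpha$ approaches it.

```python
import numpy as np

def perceptron_learning_time(N, alpha, kappa=0.0, max_epochs=10_000, seed=0):
    """Train a perceptron on P = alpha*N random +/-1 patterns and return
    the number of epochs until all patterns are stored with stability
    greater than kappa, or None if max_epochs is exceeded."""
    rng = np.random.default_rng(seed)
    P = int(alpha * N)
    xi = rng.choice([-1.0, 1.0], size=(P, N))   # random input patterns
    sigma = rng.choice([-1.0, 1.0], size=P)     # random target outputs
    w = np.zeros(N)
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for mu in range(P):
            # local stability of pattern mu; update only on a violation
            if sigma[mu] * (w @ xi[mu]) / np.sqrt(N) <= kappa:
                w += sigma[mu] * xi[mu] / np.sqrt(N)
                errors += 1
        if errors == 0:
            return epoch   # all patterns stored
    return None            # did not converge within the epoch budget

# Learning time grows sharply as alpha approaches alpha_c = 2 (kappa = 0);
# near the capacity the loop may exhaust max_epochs and return None.
for alpha in (0.5, 1.0, 1.5, 1.9):
    print(alpha, perceptron_learning_time(N=100, alpha=alpha))
```

In such a sketch, $\alpha_c$ is the load above which no solution exists at all, while $\alpha_p$ marks the load above which the measured learning time ceases to scale polynomially in $N$; the paper's point is that these two thresholds need not coincide.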